Hyperspaces for Object Clustering and Approximate Matching in Peer-to-Peer Overlays

Authors

  • Bernard Wong
  • Ymir Vigfusson
  • Emin Gün Sirer

Abstract

Existing distributed hash tables provide efficient mechanisms for storing and retrieving a data item based on an exact key, but are unsuitable when the search key is similar, but not identical, to the key used to store the data item. In this paper, we present a scalable and efficient peer-to-peer system with a new search primitive that can efficiently find the k data items with keys closest to the search key. The system works via a novel assignment of virtual coordinates to each object in a high-dimensional, synthetic space such that the proximity between two points in the coordinate space is correlated with the similarity between the strings that the points represent. We examine the feasibility of this approach for efficient, peer-to-peer search on inexact string keys, and show that the system provides a robust method to handle key perturbations that naturally occur in applications, such as file-sharing networks, where the query strings are provided by users.
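
The idea of placing string keys in a coordinate space where proximity tracks string similarity can be illustrated with a small local sketch. The abstract does not say how coordinates are assigned, so the scheme below is an assumption made only for illustration: each key's coordinates are its edit distances to a few fixed landmark strings. The names `LANDMARKS`, `to_coordinates`, and `k_closest` are hypothetical, and the linear scan stands in for whatever distributed lookup the actual system performs across peers.

```python
import heapq
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical landmark strings (an assumption, not taken from the paper).
# Each landmark contributes one axis of the synthetic coordinate space.
LANDMARKS = ["alpha", "matrix", "beatles", "windows"]


def to_coordinates(key: str) -> tuple[int, ...]:
    """Map a string to a point: one edit distance per landmark."""
    return tuple(edit_distance(key, lm) for lm in LANDMARKS)


def k_closest(query: str, keys: list[str], k: int) -> list[str]:
    """Return the k stored keys whose points lie nearest the query's point."""
    q = to_coordinates(query)
    return heapq.nsmallest(
        k, keys, key=lambda s: math.dist(q, to_coordinates(s))
    )
```

Under this assumed scheme, two strings that differ by d edits differ by at most d along every axis (by the triangle inequality for edit distance), so a slightly misspelled query lands near the point of the intended key. The converse does not hold: dissimilar strings can share nearby coordinates, which is why proximity is only correlated with similarity and a final check against the real string distance would still be useful.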

Similar resources

Approximate Matching for Peer-to-Peer Overlays with Cubit

Keyword search is a critical component in most content retrieval systems. Despite the emergence of completely decentralized and efficient peer-to-peer techniques for content distribution, there have not been similarly efficient, accurate, and decentralized mechanisms for content discovery based on approximate search keys. In this paper, we present a scalable and efficient peer-to-peer system ca...

Optimizations for Locality-Aware Structured Peer-to-Peer Overlays

We present several optimizations aimed at improving the object location performance of locality-aware structured peer-to-peer overlays. We present simulation results that demonstrate the effectiveness of these optimizations in Tapestry, and discuss their usage of the overall storage resources of the system.

A least resistance path to the analysis of unstructured overlay networks

Unstructured overlay networks for peer-to-peer applications combined with stochastic algorithms for interest-based clustering and resource location are attractive due to low-maintenance costs and inherent fault-tolerance properties. Moreover, there is a relatively large volume of experimental evidence that these methods are efficiency-wise a good alternative to structured methods, which require...

Self-Healing Protocols for Connectivity Maintenance in Unstructured Overlays

In this paper, we discuss the use of self-organizing protocols to improve the reliability of dynamic Peer-to-Peer (P2P) overlay networks. Two similar approaches are studied, which are based on local knowledge of the nodes’ 2nd neighborhood. The first scheme is a simple protocol requiring interactions among nodes and their direct neighbors. The second scheme adds a check on the Edge Clusterin...

Understanding the Practical Limits of the Gnutella P2P System: An Analysis of Query Terms and Object Name Distributions

A number of prior efforts analyzed the behavior of popular peer-to-peer (P2P) systems and proposed ways for maintaining the overlays as well as methods for searching for contents using these overlays. However, little was known about how successful users could be in locating the shared objects in these systems. There might be a mismatch between the way content creators named objects and the way s...

Publication date: 2007